feat: Docker Compose local test infrastructure with seed data#24
…tamp edge cases

Add type-class validation to TableOp.schema and TableOp.type so that unions, intersects, and minus operations reject mismatched column types early with a clear QueryBuilderError instead of silently producing incorrect results. (#5)

Fix PostgreSQL timestamp normalization: use timestamptz(6) cast for TimestampTZ columns to preserve timezone info during bounds comparison, and replace hardcoded TIMESTAMP_PRECISION_POS with length()-based calculation to correctly pad years with >4 digits. (#12)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
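To illustrate the padding fix in (#12), here is a minimal Python sketch of why a fixed offset breaks for years with more than four digits. It is not the PR's SQL: the function names, the offset of 20, and the assumption that the database renders exactly six fractional digits are all illustrative.

```python
FRACTION_DIGITS = 6  # assumption: timestamps are rendered with microseconds


def pad_fixed(ts: str, precision: int, pos: int = 20) -> str:
    """Fixed-offset variant: assumes the fractional digits start at index `pos`."""
    return ts[: pos + precision].ljust(pos + FRACTION_DIGITS, "0")


def pad_by_length(ts: str, precision: int) -> str:
    """Length-based variant: derive the offset from the string itself."""
    pos = len(ts) - FRACTION_DIGITS  # index where the fractional digits start
    return ts[: pos + precision].ljust(pos + FRACTION_DIGITS, "0")


four_digit_year = "2022-06-03 12:24:35.123456"
five_digit_year = "12022-06-03 12:24:35.123456"

# Both variants agree while the year has four digits.
print(pad_fixed(four_digit_year, 3))
print(pad_by_length(four_digit_year, 3))

# With a five-digit year the fixed offset cuts one fractional digit too early.
print(pad_fixed(five_digit_year, 3))
print(pad_by_length(five_digit_year, 3))
```

The length-based variant keeps the requested precision regardless of how many digits the year occupies, because the offset moves with the string.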
Add SQL seed data (PostgreSQL + MySQL) with ~1000 rows and deliberate diffs for showcasing data-diff. Set default connection strings for all docker-compose databases, add profiles to keep the default lightweight (PG + MySQL only), and add a Makefile for developer ergonomics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d0021615d9
tests/common.py (outdated)

    TEST_VERTICA_CONN_STRING: str = (
        os.environ.get("DATADIFF_VERTICA_URI") or "vertica://vertica:Password1@localhost:5433/vertica"
Keep Vertica tests opt-in by default
This new default URI makes Vertica always appear configured, so tests/common.py includes Vertica in CONN_STRINGS and the parameterized integration tests try to connect to it even when that service is not running. I checked .github/workflows/ci.yml, which only starts `mysql postgres presto trino clickhouse`, and the Makefile `test` target, which only starts `postgres mysql`. Both default workflows can therefore fail with connection errors unless users manually override the env var to disable Vertica.
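One way to keep a database opt-in is sketched below in Python. The helper names `optional_uri` and `configured` are hypothetical and are not taken from tests/common.py; the sketch assumes a database counts as configured only when its env var is set to a non-empty value.

```python
import os
from typing import Dict, Mapping, Optional


def optional_uri(env_var: str, environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the URI from the environment, or None if unset or empty."""
    # `or None` also maps an empty string to None, so `VAR=` disables the database.
    return environ.get(env_var) or None


def configured(conn_strings: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Keep only the databases that have a connection string."""
    return {name: uri for name, uri in conn_strings.items() if uri is not None}


# Hypothetical environment: Vertica is unset, MsSQL is set but empty.
fake_env = {
    "DATADIFF_MSSQL_URI": "",
    "DATADIFF_CLICKHOUSE_URI": "clickhouse://localhost:9000/default",
}

conn_strings = {
    "vertica": optional_uri("DATADIFF_VERTICA_URI", fake_env),
    "mssql": optional_uri("DATADIFF_MSSQL_URI", fake_env),
    "clickhouse": optional_uri("DATADIFF_CLICKHOUSE_URI", fake_env),
}

# Only databases with a real URI remain and get parameterized into tests.
print(sorted(configured(conn_strings)))
```

With no hardcoded fallback URI, a service that is not running is simply absent from the parameterized test matrix instead of failing with a connection error.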
Critical:
- Fix _add_padding double-truncation regression for rounding branch (split into _truncate_and_pad and _zero_pad for correct behavior)

Important:
- Fix non-rounding timestamp path to use timestamptz cast for TimestampTZ
- Add None guard to TableOp.type to avoid misleading errors
- Use QueryBuilderError consistently for schema length mismatch
- Revert Presto/Trino/Vertica conn defaults to None (CI doesn't test them)
- Remove unused Presto/Trino from CI docker compose command
- Add comprehensive tests for all timestamp paths and edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Revert ClickHouse default conn string to None so `make test` skips ClickHouse when the container isn't running; set URI explicitly in CI
- Add None-schema guard in TableOp.schema with clear error message
- Return None (not optimistic type) when one side of TableOp.type is unknown
- Fix Makefile comment to accurately reflect PG + MySQL (not ClickHouse)
- Add comment explaining why Join.schema skips cross-table type validation
- Add tests for TableOp.type mismatch and matching branches

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
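The type-resolution rule described in this commit (unknown on either side yields None, a mismatch raises, a match passes through) can be sketched in Python as follows. This is an illustration only: `resolve_column_type` and `check_schemas` are hypothetical helpers, the local `QueryBuilderError` is a stand-in for the project's class, and plain Python types stand in for the project's column type classes.

```python
from typing import List, Optional, Sequence


class QueryBuilderError(Exception):
    """Stand-in for the project's error class of the same name."""


def resolve_column_type(left: Optional[type], right: Optional[type]) -> Optional[type]:
    """Resolve the result type of one column in a set operation."""
    if left is None or right is None:
        # One side is unknown: report unknown rather than guessing.
        return None
    if left is not right:
        raise QueryBuilderError(
            f"Mismatched column types: {left.__name__} vs {right.__name__}"
        )
    return left


def check_schemas(
    left: Sequence[Optional[type]], right: Sequence[Optional[type]]
) -> List[Optional[type]]:
    """Validate two operand schemas column by column."""
    if len(left) != len(right):
        raise QueryBuilderError(
            f"Schema length mismatch: {len(left)} vs {len(right)} columns"
        )
    return [resolve_column_type(l, r) for l, r in zip(left, right)]


print(check_schemas([int, str, None], [int, str, float]))
```

Returning None for an unknown side avoids claiming a type the query builder cannot verify, while still failing early when both sides are known and disagree.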
- Add comment explaining why --profile full is needed (ClickHouse is profile-gated; only explicitly named services start)
- Add `or None` to Databricks and MsSQL conn strings to handle empty env vars consistently with all other optional databases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
- … `data-diff` against local containers
- … `profiles: [full]` so `docker compose up` only starts PostgreSQL + MySQL (fast <10s startup)
- … `tests/common.py`: no manual env vars needed
- … `make up`, `make up-full`, `make down`, `make test`, `make test-unit`, `make demo`
- … `DATADIFF_CLICKHOUSE_URI` env override (now defaulted in code)

Test plan
- `make down && make up`: verify only PG + MySQL start
- `docker exec dd-postgresql psql -U postgres -c "SELECT count(*) FROM ratings_source"`: should return 1000
- `docker exec dd-mysql mysql -umysql -pPassword1 mysql -e "SELECT count(*) FROM ratings_source"`: should return 1000
- `make demo`: runs data-diff against seed tables, shows diffs
- `make test-unit`: unit tests pass without any database
- `make test`: full suite passes against PG + MySQL
- `docker compose --profile full up -d --wait`: all 6 services start

🤖 Generated with Claude Code